In this research, we used a dataset from the Siksha ‘O’ Anusandhan (S’O’A) 
University Medical Laboratory containing 6,780 samples, collected manually 
and through internet of things (IoT) sensors from 6,780 patients, to 
investigate liver disease stage prediction. The dataset was cleaned before 
being passed to the machine learning pipeline. We applied a range of 
machine learning models, namely Naive Bayes (NB), sequential minimal 
optimisation (SMO), K-STAR, random forest (RF), and multi-class 
classification (MCC), using Python to predict the stages of liver disease. 
In our simulations, the SMO model performed best among these models. We 
then extended the analysis with machine learning boosting models built on 
SMO as the base model: adaptive boosting (AdaBoost), gradient boost, 
extreme gradient boosting (XGBoost), CatBoost, and light gradient boosting 
model (LightGBM). Gradient boost was the most accurate, reaching 96% 
accuracy, compared with 94.10% for AdaBoost, 90% for XGBoost, 92% for 
CatBoost, and 94% for LightGBM when combined with the SMO base model. 
These results highlight the effectiveness of the proposed model, gradient 
boosting, in improving the prediction of liver disease stage, and provide 
information useful for improving clinical decision support systems in 
medical diagnostics. 
1. INTRODUCTION 


Millions of people worldwide are affected by liver disease, which also places a heavy cost on 
healthcare systems [1] around the world. Accurate staging is required for effective care of liver diseases, in 
addition to prompt diagnosis, in order to direct the right clinical interventions. While reliable, conventional 
techniques of liver disease diagnosis have some drawbacks, particularly when it comes to assessing the 
severity and course of the disease. In this context, combining machine learning and internet of things (IoT) 
technology presents a viable path for enhancing patient care and diagnostic accuracy. 
The goal of this study is to use a secure IoT framework [2] and machine learning to predict the stages 
of liver disease [3], taking into account the important differences between stages 1 to 4. The study also 
investigates the viability of using a collection of machine learning models for this diagnostic task, using a 
real-time dataset of 6,780 samples obtained through a combination of manual sample tests and data collected 
through IoT sensors from patients of the Siksha ‘O’ Anusandhan (S’O’A) University Medical Laboratory, 
Bhubaneswar. The motivation for this endeavour is the potential to improve the precision and effectiveness 
of liver disease diagnosis, leading to better patient care and therapeutic results [4]. 

The real-time dataset under investigation includes 15 different parameters collected through IoT 
sensors, chosen to capture a comprehensive picture of liver health: the number of days the patient has been 
receiving treatment, ascites, age, sex, hepatomegaly, spiders, oedema, bilirubin, cholesterol, albumin, copper, 
alkaline phosphatase (alk_phospho), serum glutamic oxaloacetic transaminase (SGOT), triglycerides, 
platelets, and prothrombin. Taken together, these parameters describe the clinical, biochemical and 
demographic profile of the patient. Precise identification and assessment of liver disease [5], [6] is made 
possible by the presence of specific clinical symptoms in conjunction with elevated levels of certain markers, 
such as SGOT and bilirubin, which can help with diagnosis and treatment. The measurements range from 
classic clinical measures, such as liver enzyme levels and bilirubin concentration, to more modern data 
collected from IoT devices like those in Figure 1, such as continuous monitoring of vital signs and other 
pertinent health indicators [7], [8]. This study aims to use the complete range of available information to 
forecast the stages of liver disease by combining the traditional and contemporary elements of healthcare 
data. 
Figure 1. Cloud service in health care 


Traditional machine learning models [9] such as Naïve Bayes (NB), sequential minimal optimisation 
(SMO), K star (K*) and random forest (RF) are used for this prediction task. These models were chosen for 
their diversity and proven efficacy in classification tasks. Because predictive performance can often be 
improved further, we also apply boosting strategies. We aim to improve the accuracy and robustness of the 
chosen traditional models in predicting the stages of liver disease by applying boosting techniques such as 
adaptive boosting (AdaBoost) and gradient boosting to the results of the best traditional model. 

This research contributes to the diagnosis and stage prediction of liver disease. The findings could 
change how liver disease is managed by giving clinicians a quicker and more precise tool for diagnosing and 
monitoring disease progression. Additionally, the incorporation of IoT sensors [10] into the diagnostic 
procedure highlights the possibility of ongoing patient monitoring and more individualised healthcare 
(Figure 2). The methodology, research design and results sections below shed light on the effectiveness of 
conventional machine learning models and on the potential of the boosting techniques used to forecast liver 
disease [11]. Because of its potential effects on healthcare and on the future management of liver disease, 
this research is relevant to improving patient outcomes and raising the standard of care for liver illnesses. 

NB is a popular machine learning method that is extensively used for classification tasks [12]. The 
algorithm is based on Bayes' theorem and simplifies the problem by assuming that features are independent, 
even though this assumption might not hold in real-world situations. SMO is a specialised technique for 
training support vector machines (SVMs), which are frequently used in machine learning for classification 
and regression problems [13]. SMO efficiently solves the quadratic optimisation problem that arises during 
SVM training. K* is an instance-based classifier related to k-nearest neighbour (KNN): it classifies a test 
instance from the training instances most similar to it, using an entropic distance measure based on the 
probability of transforming one instance into another. RF is an ensemble learning method that is used for a 
variety of tasks, including classification and regression. During the training phase, several decision trees are 
created; in classification, the RF output is the class predicted most frequently across the trees. A useful 
indicator for evaluating classification models, particularly in binary classification scenarios, is the Matthews 
correlation coefficient (MCC) [14]. It evaluates the model's predictions by taking into account true positives 
(cases correctly predicted as positive), true negatives (cases correctly predicted as negative), false positives 
(cases incorrectly labelled as positive when they were negative) and false negatives (cases incorrectly 
labelled as negative when they were positive). 
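
As an illustration of how the four confusion-matrix counts above combine, the following is a minimal Python sketch of the standard binary MCC formula. The function name and the example counts are hypothetical and are not taken from the study's data; returning 0.0 when the denominator is zero is a common convention, assumed here.

```python
import math

def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: return 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0

# Hypothetical counts, for illustration only.
example_mcc = matthews_corrcoef(tp=90, tn=85, fp=15, fn=10)
```

The coefficient ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (every prediction correct).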


[Figure 2 labels: glucose sensor; blood pressure sensor (sphygmomanometer); activity and mobility sensors; 
temperature sensors; liver fibrosis sensors; bilirubin sensors; personal digital assistant; medical server; 
medical doctor; communication links including Zigbee, Bluetooth, Z-Wave, LoRa, SigFox, WiMax, Ingenu and 2G/4G] 

Figure 2. Internet of medical things (IoMT) architecture 


The following attributes are used for performance comparison: 

i) Mean absolute error (MAE). MAE is the average absolute difference between paired predicted and 
actual values in a dataset. Because it ignores the direction of each error, it gives a direct picture of how far 
the predictions are from the true values on average; a lower MAE means that the model's predictions are 
generally closer to the true values. Every error contributes in proportion to its magnitude, so MAE is less 
sensitive to outliers than squared-error metrics. 

ii) Root mean square error (RMSE). RMSE is the square root of the mean squared difference between 
the predicted and actual values. Because squaring penalises larger errors more severely than smaller ones, 
RMSE is more sensitive to outliers, and the effect of large errors on the model's overall rating is magnified. 
A smaller RMSE denotes higher accuracy. RMSE is a commonly used metric for comparing different 
techniques on the same data. 

iii) Relative absolute error (RAE). RAE is the ratio of the model's total absolute error to the total 
absolute error of a simple baseline that always predicts the mean of the actual values, usually expressed as 
a percentage. This normalisation allows comparisons between datasets of different sizes and scales. A lower 
RAE denotes better model performance: values near zero imply that the predictions and actual values are 
closely aligned, while a value of 100% means that the model does no better than the baseline. 

iv) Root relative squared error (RRSE). RRSE is the square root of the ratio of the model's total squared 
error to the total squared error of the same mean baseline, which is equivalent to the square root of the mean 
squared error divided by the variance of the actual values. By taking the variability of the dataset into 
account, RRSE offers a normalised assessment of accuracy. A lower RRSE denotes better model 
performance, and values closer to zero suggest that the model's predictions agree with the actual values. 
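
The four attributes above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: RAE and RRSE are reported as percentages relative to a baseline that always predicts the mean of the actual values, and the function name and example values are hypothetical, not the study's code or data.

```python
import math
from statistics import fmean

def error_metrics(actual, predicted):
    """Return MAE, RMSE, RAE (%) and RRSE (%) for paired numeric values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    mean_actual = fmean(actual)
    abs_err = [abs(p - a) for a, p in zip(actual, predicted)]
    sq_err = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    # Errors of the baseline that always predicts the mean of the actual values.
    base_abs = sum(abs(a - mean_actual) for a in actual)
    base_sq = sum((a - mean_actual) ** 2 for a in actual)
    if base_abs == 0:
        raise ValueError("actual values are constant; RAE and RRSE are undefined")
    return {
        "MAE": fmean(abs_err),
        "RMSE": math.sqrt(fmean(sq_err)),
        "RAE": 100 * sum(abs_err) / base_abs,
        "RRSE": 100 * math.sqrt(sum(sq_err) / base_sq),
    }

# Hypothetical stage labels (1 to 4), for illustration only.
metrics = error_metrics(actual=[1, 2, 3, 4], predicted=[1, 2, 4, 4])
```

With these definitions, a predictor that always outputs the mean of the actual values scores exactly 100% on both RAE and RRSE, which gives a reference point for reading the relative metrics.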


2. METHOD 
2.1. Data collection 

We collected data from 6,780 liver disease patients attending the S’O’A University Medical 
Laboratory, Bhubaneswar for treatment, both manually and using several implanted IoT devices. Some 
common IoT sensors used in healthcare applications for identifying liver disease symptoms are: 

— Liver function tests (LFT) sensors: these could measure levels of enzymes, bilirubin and other 
substances indicative of liver function. 

— Wearable health monitors: devices that track vital signs such as heart rate, blood pressure, and activity 
levels can provide valuable health data. 

— Blood glucose monitors: monitoring glucose levels can be relevant, as liver health is interconnected 
with metabolic processes. 

We relied on an IoT framework from data collection through to storage in the cloud, which is made 
possible by the different layers of the IoT architecture [15], [16]. The IoT framework plays a vital role in 
transmitting the data from the sensors to the cloud over a secure path using its built-in IoT communication 
protocols. This helps to reduce the loss of important data during communication. 


2.2. Data cleaning and preparation 

Data cleansing comes first in the data preparation phase after a dataset has been gathered. In this 
essential stage, problems with the dataset that can prevent correct analysis and modeling are found and fixed. 
The central tendency measurements mean, mode and median are important in this data cleaning process. 


2.2.1. Handling missing values 

Handling missing values in a dataset before training a machine learning model is important for a 
number of reasons. First, many machine learning algorithms cannot deal with missing values directly and 
may fail or yield incorrect results when they are present. Second, missing values in the training data can lead 
to biased conclusions and predictions. Third, the model's generalisability and predictive accuracy may suffer 
if missing values are ignored. Finally, missing values can skew statistical calculations and analyses, which 
lowers the model's overall quality. Treating the missing values through imputation or removal therefore 
helps to ensure that the model is trained on complete data, which produces more robust predictions. The 
methods we used to handle missing values are as follows. 

— Mean imputation: if the dataset has missing numerical values, one common approach is to replace these 
missing values with the mean of the available data in the same column. This helps to preserve the 
overall distribution of the data. 

— Mode imputation: for categorical data, missing values can be replaced with the mode, which is the most 
frequent value in the column. 

— Median for robustness: when dealing with outliers (extreme values that can skew the analysis), the 
median is often preferred over the mean. The median is less sensitive to outliers, making it a more 
robust measure of central tendency. 

— Data validation: using these central tendency measures can also help identify potential errors or 
inconsistencies in the dataset. Extreme values that are far from the mean or median may warrant further 
investigation as potential data entry errors. 

— Impact on analysis: it's important to note that data cleaning decisions, such as imputing missing values 
or handling outliers, can impact the results of subsequent analyses or machine learning models. 
Therefore, these steps should be performed with careful consideration of the specific goals of the 
analysis. 
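
The three imputation strategies listed above can be sketched as follows. This is a minimal standard-library illustration, not the study's pipeline: the helper name `impute`, the use of `None` to mark a missing entry, and the sample values are all assumptions made for the example.

```python
from statistics import fmean, median, mode

def impute(column, strategy):
    """Replace None entries in a column using the chosen central tendency."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values to impute from")
    fill = {"mean": fmean, "median": median, "mode": mode}[strategy](observed)
    return [fill if v is None else v for v in column]

# Hypothetical values, for illustration only (None marks a missing entry).
bilirubin = [0.8, None, 1.2, 14.0, 1.0]   # numeric, with one extreme value
ascites = ["N", "N", None, "Y", "N"]      # categorical

bilirubin_mean = impute(bilirubin, "mean")      # pulled upwards by the extreme value
bilirubin_median = impute(bilirubin, "median")  # stays close to the typical values
ascites_mode = impute(ascites, "mode")          # filled with the most frequent category
```

In this example the single extreme bilirubin value drags the mean fill well above the typical readings, while the median fill does not, which is the robustness argument made in the list above.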
3. RESEARCH DESIGN 

The dataset used for this study was collected from 6,780 liver disease patients of the S’O’A University 
Medical Laboratory, Bhubaneswar. Various machine learning models were employed to predict the stage of 
liver disease, from stage 1 to stage 4, based on its complications. The workflow is shown in Figure 3. 
Figure 3. Flow of work 


3.1. Proposed boosting models 

The "boosting" ensemble modelling technique [17] seeks to build a strong classifier from a large 
number of weak classifiers by chaining weak models together. A first model is built using the training data. 
A second model is then built to correct the first model's errors. This procedure is repeated until the maximum 
number of models has been added or until the whole training set is predicted correctly. To raise the accuracy 
of the final model, boosting combines the predictions of the weak models, averaging them for regression or 
voting over them for classification. In our experimental comparisons, we included five well-known boosting 
approaches: AdaBoost, GradientBoost, XGBoost, CatBoost and light gradient boosting model (LightGBM). 


3.2. Adaptive boosting 

AdaBoost is an ensemble learning technique that emphasises rectifying the errors of weak learners 
by giving extra weight to data points that have been incorrectly categorised. It creates a strong classifier by 
combining a number of weak classifiers, frequently decision trees. Each weak classifier is trained one at a 
time and samples that were incorrectly identified are given heavier weights in the following model. 
According to their performance, AdaBoost modifies the model weights, with more accurate models having a 
greater impact on the outcome of the prediction. 
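
The reweighting scheme described above can be sketched in plain Python. This is a simplified illustration, not the study's implementation: it assumes binary labels in {-1, +1} (the study predicts four stages), uses single-feature decision stumps as the weak learners rather than SMO, and all function names and data are hypothetical.

```python
import math

def fit_stump(X, y, w):
    """Find the single-feature threshold rule with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[j] <= thr else -sign for row in X]
                err = sum(wi for wi, p, t in zip(w, pred, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] <= thr else -sign

def adaboost_fit(X, y, rounds=3):
    """Discrete AdaBoost for labels in {-1, +1}; returns [(alpha, stump), ...]."""
    n = len(X)
    w = [1.0 / n] * n                      # start with equal sample weights
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)      # more accurate model, larger vote
        ensemble.append((alpha, stump))
        # Misclassified samples receive heavier weights for the next round.
        w = [wi * math.exp(-alpha * t * stump_predict(stump, row))
             for wi, row, t in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical one-feature data that no single threshold can separate.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost_fit(X, y, rounds=3)
predictions = [adaboost_predict(model, row) for row in X]
```

On this toy data no single stump classifies every point correctly, but the weighted vote of three stumps does, which illustrates how weak classifiers combine into a stronger one.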
3.3. Gradient boosting 

Gradient boosting is a general boosting framework that builds an ensemble of decision trees to 
increase the accuracy of predictions. In contrast to AdaBoost, which reweights data points, gradient boosting 
optimises the ensemble by minimising a loss function of the ensemble's predictions. Decision trees are added 
incrementally, and each new tree is fitted to the negative gradient of the loss with respect to the current 
predictions, so that it corrects the errors of the trees before it. Popular implementations of gradient boosting 
include XGBoost, LightGBM and CatBoost. 
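
The incremental procedure described above can be sketched for the simplest case. This is a minimal illustration under stated assumptions, not the study's model: it uses squared-error loss (for which the negative gradient is simply the residual), one numeric feature, depth-one regression trees (stumps), and hypothetical function names, hyperparameters and data.

```python
from statistics import fmean

def fit_regression_stump(x, residuals):
    """Return (threshold, left_value, right_value) minimising squared error."""
    best = None
    for thr in sorted(set(x))[:-1]:  # the largest value would leave the right side empty
        left = [r for xi, r in zip(x, residuals) if xi <= thr]
        right = [r for xi, r in zip(x, residuals) if xi > thr]
        lv, rv = fmean(left), fmean(right)
        sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lv, rv)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]

def gradient_boost_fit(x, y, rounds=50, learning_rate=0.1):
    """Least-squares gradient boosting; returns (base, learning_rate, stumps)."""
    base = fmean(y)                      # initial constant prediction
    pred = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        # For squared loss, the negative gradient is the residual y - prediction.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        thr, lv, rv = fit_regression_stump(x, residuals)
        stumps.append((thr, lv, rv))
        pred = [pi + learning_rate * (lv if xi <= thr else rv)
                for xi, pi in zip(x, pred)]
    return base, learning_rate, stumps

def gradient_boost_predict(model, xi):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(lv if xi <= thr else rv
                                      for thr, lv, rv in stumps)

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.2, 2.9, 3.1, 5.0, 5.2]
model = gradient_boost_fit(x, y)
fitted = [gradient_boost_predict(model, xi) for xi in x]
```

Each round adds a small correction fitted to what the current ensemble still gets wrong; the learning rate shrinks each correction so that no single tree dominates.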


3.4. Extreme gradient boosting 

XGBoost is an efficient and scalable implementation of gradient boosting that has been popular in 
both machine learning competitions and practical applications. It handles missing data, employs 
regularisation techniques to avoid overfitting and supports parallel and distributed computing. Advanced 
features such as early stopping to avoid overtraining and sparsity-aware handling of sparse data are also 
available with XGBoost. 


3.5. CatBoost 

CatBoost is a gradient boosting library that was created specifically to support categorical features. 
It is practical for many real-world datasets since it handles categorical variables automatically without the 
need for explicit encoding. CatBoost offers robust performance out of the box and has built-in mechanisms 
to reduce overfitting. 


3.6. Light gradient boosting model 

LightGBM is another gradient boosting library with a solid reputation for speed and efficiency. 
Feature values are binned using a histogram-based method, which uses less memory and speeds up training. 
LightGBM is a good choice for parallel processing and large datasets. It offers several boosting modes, 
including conventional gradient boosting and a random forest mode. Each of the boosting algorithms 
described above has particular strengths that suit different types of data and tasks, making them effective 
tools for increasing the accuracy of predictive modelling. 


4. RESULTS AND DISCUSSION 

Table 1 reports the accuracy (%) obtained from the Python simulation of the five machine learning 
models above on the dataset. We divided the 6,780 samples into the six ranges listed in Table 1 and applied 
each machine learning model to every range. Figure 4 shows the corresponding graph. 


Table 1. Comparison of NB, SMO, K-STAR, RF and MCC on the basis of correct accuracy 
Correct accuracy (%) 
Class       NB      SMO     K-STAR  RF      MCC 
1-1130      85      91.06   73.86   88.75   90.62 
1130-2260   86.76   93.94   75.42   92.7    93.49 
2260-3390   84.99   89.95   75.96   87.91   89.33 
3390-4520   86.05   92.08   75.87   91.1    91.55 
4520-5650   86.23   93.58   77.82   92.7    93.23 
5650-6780   81.52   98.78   75.98   86.45   98.18 
Figure 4. NB, SMO, K-STAR, RF and MCC on the basis of correct accuracy 


Tables 2 to 5 report the MAE, RMSE, RAE and RRSE, respectively, obtained from the Python 
simulation of the same five machine learning models on the dataset. Figures 5 to 8 show the corresponding 
graphs. 


Table 2. Comparison of NB, SMO, K-STAR, RF and MCC on the basis of MAE 
MAE 
Class       NB      SMO     K-STAR  RF      MCC 
1-1130      0.3327  0.3195  0.3303  0.3265  0.3232 
1130-2260   0.3215  0.3149  0.3203  0.3159  0.3118 
2260-3390   0.324   0.3169  0.3224  0.3271  0.3232 
3390-4520   0.3268  0.3183  0.3211  0.3219  0.3212 
4520-5650   0.3324  0.3161  0.3136  0.3174  0.314 
5650-6780   0.3423  0.3226  0.3217  0.3325  0.331 
Figure 5. Graphical comparison of NB, SMO, K-STAR, RF and MCC on the basis of MAE 


Table 3. Comparison of NB, SMO, K-STAR, RF and MCC on the basis of RMSE 
RMSE 
Class       NB      SMO     K-STAR  RF      MCC 
1-1130      0.4243  0.4083  0.5396  0.4105  0.4052 
1130-2260   0.4202  0.4026  0.534   0.4031  0.3978 
2260-3390   0.4216  0.4052  0.5343  0.4109  0.4053 
3390-4520   0.4284  0.4069  0.5347  0.4053  0.4044 
4520-5650   0.4226  0.4041  0.5281  0.4029  0.3992 
5650-6780   0.4334  0.4121  0.5328  0.414   0.41 
Figure 6. Graphical comparison of NB, SMO, K-STAR, RF and MCC on the basis of RMSE 
Table 4. Comparison of NB, SMO, K-STAR, RF and MCC on the basis of RAE 
RAE 
Class       NB      SMO     K-STAR  RF      MCC 
1-1130      103.08  99.01   102.34  101.17  100.15 
1130-2260   102.54  100.42  102.14  100.73  99.45 
2260-3390   100.13  97.94   99.62   100.09  99.88 
3390-4520   101.93  99.29   100.14  100.41  100.18 
4520-5650   105.21  100.04  99.28   100.48  99.4 
5650-6780   103.74  97.76   97.49   100.77  100.3 
Figure 7. Graphical comparison of NB, SMO, K-STAR, RF and MCC on the basis of RAE 


Table 5. Comparison of NB, SMO, K-STAR, RF and MCC on the basis of RRSE 
RRSE 
Class       NB      SMO     K-STAR  RF      MCC 
1-1130      105.67  101.68  134.37  102.21  100.92 
1130-2260   106.15  101.72  134.91  101.85  100.5 
2260-3390   104.85  100.75  132.87  102.18  100.8 
3390-4520   107.03  101.66  133.58  101.26  101.03 
4520-5650   106.35  101.7   132.93  101.41  100.47 
5650-6780   106.73  101.47  131.19  101.94  101 
Figure 8. Graphical comparison of NB, SMO, K-STAR, RF and MCC on the basis of RRSE 


Tables 1 to 5 and their corresponding figures are used for the performance comparison [18]-[21] of 
all the above models. Correct accuracy, MAE, RAE, RMSE, and RRSE are the five parameters used to 
measure the performance of these machine learning models. The model with the highest accuracy and the 
lowest error values is considered the best. After examining the data in Tables 1 to 5 and their corresponding 
figures, we found that the SMO model gives better results on this particular dataset than the other models. 
We therefore consider SMO the best model for predicting liver disease stages. We then applied the machine 
learning boosting models to the results of the selected SMO model. Table 6 reports the accuracy of the 
different boosting models in combination with the SMO model, and Figure 9 shows the corresponding graph. 
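
As a quick arithmetic check of the accuracy comparison, the Table 1 values can be averaged per model. Averaging over the six sample ranges is an assumption made for this illustration and is not necessarily the selection procedure used in the study; the variable names are hypothetical.

```python
from statistics import fmean

# Correct-accuracy values (%) copied from Table 1, one entry per sample range.
table1 = {
    "NB":     [85, 86.76, 84.99, 86.05, 86.23, 81.52],
    "SMO":    [91.06, 93.94, 89.95, 92.08, 93.58, 98.78],
    "K-STAR": [73.86, 75.42, 75.96, 75.87, 77.82, 75.98],
    "RF":     [88.75, 92.7, 87.91, 91.1, 92.7, 86.45],
    "MCC":    [90.62, 93.49, 89.33, 91.55, 93.23, 98.18],
}

# Mean accuracy of each model across the six ranges.
mean_accuracy = {name: fmean(values) for name, values in table1.items()}
best_model = max(mean_accuracy, key=mean_accuracy.get)
```

Under this averaging, SMO has the highest mean accuracy (about 93.2%), followed by MCC (about 92.7%), which agrees with the choice of SMO as the base model.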
Table 6. Accuracy level after adding boosting models with SMO 
Boosting with SMO   Accuracy 
AdaBoost            94.10% 
Gradient boost      96% 
XGBoost             90% 
CatBoost            92% 
LightGBM            94% 


Figure 9. Accuracy after adding boosting models 


In this work we reviewed recent studies on the prediction of liver disease and found research gaps 
regarding the implementation of hybrid machine learning models. Applying machine learning boosting 
models on top of base models can enhance disease prediction accuracy. We implemented such boosting 
models and obtained higher accuracy than other machine learning models [22], [23], which supports the 
novelty of our work on the prediction of liver disease. Table 7 compares the performance of existing works 
with our present work. 


Table 7. Result comparison with some existing work 
Author                  Year  Machine learning models                                        Accuracy 
Dritsas and Trigka [3]  2023  NB and LR                                                      80.1% 
Singh et al. [24]       2020  LR, SMO, RF, NB, J48, and K-Nearest                            72.50% 
Amin et al. [25]        2023  LR, RF, K-Nearest, SVM, MLP and voting models                  91.40% 
Our work                2023  NB, SMO, K-STAR, RF, MCC and machine learning boosting models  96% 


5. CONCLUSION 

This research uses a real-time dataset of 6,780 records, collected manually and using IoT sensors 
from patients of the S’O’A University Medical Laboratory, to forecast the stages of liver disease. To predict 
the stage of liver damage in patients, we used a number of machine learning models, including NB, SMO, 
K-STAR, RF, and MCC. Simulation and analysis showed that the SMO model performed better than the 
others. The SMO model's results were then passed to a variety of machine learning boosting models: 
AdaBoost, gradient boost, XGBoost, CatBoost and LightGBM. Gradient boosting emerged as the most 
accurate option, with an accuracy of 96%. The study demonstrates the ability of machine learning models, 
particularly gradient boosting, to support early detection and intervention for patients with liver disorders, 
and represents an improvement in liver disease stage prediction. It paves the way for quicker and more 
precise medical interventions, which can enhance patient outcomes and healthcare productivity. Future 
research will focus on bridging the gap between theoretical machine learning applications and real-world 
healthcare applications. 